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Abstract. Monte Carlo simulation with a-priori unknown weights have 
attracted recent attention and progress has been made in understand- 
ing (i) the technical feasibility of such simulations and (ii) classes of 
systems for which such simulations lead to major improvements over 
conventional Monte Carlo simulations. After briefly sketching the his- 
tory of multicanonical calculations and their range of application, a 
general introduction in the context of the statistical physics of the d- 
dimensional generalized Potts models is given. Multicanonical simu- 
lations yield canonical expectation values for a range of temperatures 
or any other parameter(s) for which appropriate weights can be con- 
structed. We shall address in some details the question how the mul- 
ticanonical weights are actually obtained. Subsequently miscellaneous 
topics related to the considered algorithms are reviewed. Then multi- 
canonical studies of first order phase transitions are discussed and finally 
applications to complex systems such as spin glasses and proteins. 



1 Introduction and Summary 

One of the questions which ought to be addressed before performing a large scale 
computer simulation is "What are suitable weight factors for the problem at hand?" 
Since the 1970s it has been expert wisdom that Monte Carlo (MC) simulations with 
a-priori unknown weight factors are feasible and deserve to be considered [78] . With 
focus on narrow classes of applications, this idea was occasionally re-discovered, for 
instance [4]. With the work of ref. [7, 8] it became a more widely accepted idea, 
which is employed for an increasing number of applications in Physics, Chemistry, 
Structural Biology and other areas [21]. 
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In 1988 the Italian APE^ collaboration [3] raised the question whether the 
SU{3) deconfining phase transition in lattice gauge theory is really of first order, 
as everyone believed, or possible of second order. The first order nature of the 
transition was ultimately confirmed and in our context it is only of interest that it 
was this question which triggered the theoretical high energy physics community to 
perform a number of numerical precision studies of first order phase transitions. In 
particular difficulties to calculate the interfacial tension were noted, as is reflected 
in the papers by Potvin and Rebbi [73], as well as Kajantie et al. [59], where new 
methods were introduced and tested for the^ 2d 7-state Potts model (and later 
applied to SU{3) lattice gauge theory). The author of this review was involved in a 
number of studies of the SU{3) deconfining transitions, which cumulated in ref.[l] 
with systematic applications and developments of re-weighting techniques [37] for 
SU{3) gauge theory. In this context the papers with Neuhaus [7, 8] originated 
where we introduced the multicanonical method and succeeded for the 2d 10-state 
Potts model to get accurate interface tension estimates out of Binder's [26] his- 
togram technique. Looking back, it is to some extent astonishing that we did not 
realize immediately the connection with Torrie and Valeau's [78] umbrella sam- 
pling techniques. However, Binder's histogram method (known to us by personal 
contacts) was by then ten years old and more or less dormant due to the problem 
of supercritical slowing down discussed below. It was just not a natural idea to 
search the literature under the assumption that everyone may have overlooked an 
existing numerical method that gets it working. The umbrella method had stayed 
confined to small expert circles. The major reason for this was, most likely, that 
non-experts could never figure out how to get the a-priori unknown weight factors 
in the first place. With respect to this point the multicanonical papers initiated 
progress beyond simply making the method popular. 

We are here concerned with MC simulations of the Gibbs canonical ensem- 
ble, see the next section for preliminaries. Multicanonical simulations calculate 
canonical expectation values in a temperature range, whereas conventional canon- 
ical simulations [66] calculate at a fixed temperature T and can, by re-weighting 
techniques, only be extrapolated to a vicinity of this temperature. Interest into 
re- weighting techniques, an idea which can be traced back to the late 1950's [75], 
exploded with the paper by Ferrenberg and Swendsen [37]. Apparently, the time 
was then right for the radical step of transgressing entirely beyond conventional, 
canonical MC simulations. Still, a lucky accident helped the rapid acceptance of 
the multicanonical method. Namely, for the 2d 7-state Potts model ref. [73, 59] 
had produced interface tension estimates which were by an entire order of magni- 
tude larger than the multicanonical estimates [56] for the same model. Normally, 
such numerical discrepancies are difiicult to resolve. However, for the 2d 5-state 
Potts model a miracle happened: Building on results of [62, 63, 32] the exact 2d 
Potts model interface tensions were derived [29] shortly after the simulations were 
completed. The exact results and the multicanonical estimates were found to be in 
excellent agreement and the previous controversy converted to a significant boost 
for the new method. 

In the application to calculations of interface tensions the emphasis is on en- 
hancing rare configurations. Canonical MC simulations [66] sample configurations 



^ Array processor with emulator, the collaboration named itself after their first dedicated 
computer. 

^Here and in the following d denotes the dimension of the considered system. 
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k with the Boltzmann^ weights 

WB{k) = wb{E^''^) = e-'''^"" (1.1) 

where E'^'^^ is the energy of configuration k, (3 = 1/T and units are chosen such that 
fcs = 1 holds for the Boltzmann constant. The resulting probability distribution 
for the energy is 

P{E) = wb{E) = Cfi n{E) e"''^ (1.2) 

where n{E) is the spectral density, i.e. the number of configurations of energy 
E. The normalization constant Cfj is needed to ensure ^ 1- ^ 

characterize the lattice size (for instance N = L'^ spins). For first order phase 
transitions pseudocritical points /3^(i) exist such that the energy distributions 
P{E) = P{E-.L) become double peaked and P'^{L) can be chosen such that the 
maxima at E^^ < E^^ are of equal height Pmax = -P(S,^„ax) = -P(^max)- In- 
between these values a minimum is located at some energy -Emin- Configurations 
at Ejnin are exponentially suppressed like 

Pmin = P{Emin) = Cf exp(-/M) (1.3) 

where is the interface tension and A is the minimal area between the phases, 
A = 2L'^~^ for an L'' lattice, c/ and p are constants, p = d—1 in. the capillary-wave 
approximation [31, 39, 70]. To determine the interface tension one has to calculate 
the quantities 

and make a finite size scaling (FSS) extrapolation of f^{L) for L — > oo. However, for 

large systems a canonical MC simulation (1.2) will practically never visit configu- 
rations at energy E = E'min and estimates of the ratio R{L) will be very inaccurate. 
The terminology supercritical slowing down was coined to characterize such an ex- 
ponential deterioration of simulation results with lattice size. Ref. [8] overcame this 
problem by sampling, in an appropriate energy range, with an approximation 

wmu{k) = w^^iEC^)) = e-K^'^')^<^'+«(^^'=') (1.5) 

to the weights 

w,/n{k) = w,/„{E(''y) = -^^. (1.6) 

Here the function b{E) is the microcanonical tem,perature at energy E and a{E) 
is some kind of fugacity. Whereas approximations to the weig ht factors'' l/n{E) 
were already used in umbrella sampling [78], the parameterization (1.5) in terms 
of the microcanonical temperature b{E) was introduced in [7] and is typical for the 
multicanonical approach. The function b{E) has a relatively smooth dependence on 
its arguments which makes it a very useful quantity when dealing with the weight 
factors. 

With an approximation WmuiE^'^^) to the weights (1.6) one samples instead of 
the canonical energy distribution P{E) a new multicanonical distribution 

Pmu{E) = Cmu n{E) Wmu{E) ~ Cmu ■ (1-7) 



■^Recently Tsallis [79] weights received also attention [46]. 

*The first detailed discussion of the connection of the multicanonical weights (1.5) with the 
inverse spectral density was given in [9] . 
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The desired canonical distribution (1.2) is obtained by re-weighting 

This relation is rigorous, because the weights Wmu{E) used in the actual simulation 
are exactly known. With the approximate relation (1.7) the average number of 
configurations sampled does not longer depend strongly on the energy and accurate 
estimates of the ratio (1.4) R{L) = Pmin/fmax become possible. Namely for i = 1,2 
the equation 

rf(T\ _ Zf fT \ ^^rnM(-^max) exp(— /j'^ .Eniin) . , p (T\ — ^rnujEmin) „s 

R{L) - Rmu{L) e^v(-B-E ) ""^^ ~ P (E^ ) ^ ' 

"Jmuy-'-'mniJ cJS-p^^ fj -L'max^ muV-'^max/ 

holds and the statistical errors are those of the ratio Rmu times the exactly known 

factor. The errors of Rrnu{L) do not suffer from supercritical slowing down, because 
in the multicanonical simulation i^„iin is about as frequently visited as i^^ax or 
E^ . 

max 

The attentive non-expert will raise the objection that the argument appears to 
be circular. The weights (1.6) are a-priori unknown and would they be known we 

could choose Wmu{E^''^) = Wi/n{E^''^) and have Rmu = Pmu{Emin) / PmuiEmaK) = 

1, i.e. there is no need anymore to perform a simulation as R{L) of equation (1.9) is 
then exactly known. The answer to this is that the multicanonical method consists 
of two steps 

1. Obtain a working estimate €jmu{k) of the weights Wi/n{k). Working estimate 
means that the approximation to (1.6) has to be good enough to ensure 
movement in the desired energy range, but deviations of Pmu{E) from the 
constant behavior (1.7) by a factor of, say, ten are tolerable. 

2. Perform a Markov chain MC simulation with the weights Wmu(k). Canonical 
expectation values are found by re-weighting to the Gibbs ensemble and 
standard jackknife methods [36, 10] allow reliable error estimates. 

For the 2d 10-state Potts model figure 1 reproduces thus obtained multicanon- 
ical and re- weighted canonical energy histograms of ref.[8]. The interface tension 
estimate of ref.[8] is depicted in figure 2. 

Assume a researcher, familiar with the canonical MC method of Metropolis et 
al. [66], likes to get started with multicanonical simulations. For canonical simula- 
tion the weights (1.1) are exactly known. The first stumbling block is now to get 
the weight factors Wmu{k). For small to medium sized discrete systems my recom- 
mendation is to implement the general purpose recursion [17] which is discussed 
in section 3. For first order phase transitions an alternative is to rely on the FSS 
scaling behavior of the energy distribution, see section 5. For large enough systems 
this approach has advantages, but its applicability is limited to systems for which 
a FSS theory of P{E) exist. Some people recommend constrained microcanonical 
MC simulations, which are also shortly sketched at the end of section 5. 

Once the weights Wmu{k) are fixed the simulation is done by Markov chain 
MC in the usual way. The second complication, which is minor compared to the 
first, is that canonical expectation values of physical observables are no longer sim- 
ple arithmetic averages of the generated ("measured") values, but more involved 
equations have to be employed. This expands the programming efforts, but causes 
no real difficulties. There is a major award for this additional work: Multicanon- 
ical simulations allow for immediate estimates of the free energy and the entropy, 
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Figure 1 Multicanonical Pmu{E) versus canonical P{E) energy distribution 
as obtained in ref. [8] for the 2d 10-state Potts model on a 70 X 70 lattice. 

whereas their reconstruction from results of canonical simulations requires tedious 
integration techniques. These issues are reviewed in the next section. 

General properties of the method are discussed in section 4. This includes a 
discussion of its slowing down, static and dynamic aspects of the algorithm, and 
a summary of papers which have reported variants and new progress. In addition 
two related methods are sketched: parallel tempering and random walk algorithms. 
We then turn to applications in the next sections. 

For first order phase transitions the multicanonical method is now well-estab- 
lished and a number of applications are summarized in section 5. Next, it has been 
realized early [11] that multicanonical simulations do not only allow to enhance the 
weights of relevant rare configurations, but can also be employed to improve the dy- 
namics of the Markov process, i.e. its movement through configuration space in the 
presence of free energy barriers. Equation (1.3) is a particularly simple example of 
an explicitly controlled free energy barrier. Of major interest in nowadays research 
are complex systems, characterized by a rough free energy landscape for which an 
explicit parameterization is not known. The most important complex systems are 
presumably proteins for which multicanonical investigations have been pioneered by 
Hansmann and Okamoto [43] . Applications of the multicanonical method to these 
systems are summarized in section 6. The lack of explicit parameterizations of the 
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Figure 2 Interface tensions (1.4) and their L ^ oo extrapolation of ref.[8]. 



free energy barriers makes complex systems difficult to study. For simulations one 
has in principle two strategies: 

1. Enhance the suppressed configurations and move over the barriers. 

2. Move around the barriers. 

Only the first strategy allows for an explicit calculation of barrier heights, because in 
the second configurations with barriers are still not sampled. It is with respect to the 
second strategy, but not the first, that the multicanonical approach is in competition 
with the method of multiple Markov chains [40] (see parallel tempering in section 4) . 
The canonical weights are not changed in the latter method, configurations which 
are rare in the Gibbs ensemble (for instance configurations with interfaces) stay rare 
and the strength of the two methods can be quite distinct, although a considerable 
overlap in the range of potential applications exists. 

Finally, conclusions and outlook are given in section 7. 



2 Preliminaries 

MC simulations of systems described by the Gibbs canonical ensemble aim 
at calculating estimators of physical observables O. For discrete systems (on a 
computer all systems are discrete) the expectations values at temperature (3 = 1/T 
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are defined by 

K 

O = 0(/3) =< >= Z-^ ^^'^^ e"'^^'" (2-1) 

fe=i 

where 

K 

Z = Z{P)=J2e-^''"" (2-2) 



is tlie partition function. The sum k = 1,. . . ,K goes over all configurations (or 

microstatcs) of the system and i?^'^^ is the energy of microstate k. 

Here we focus our discussion on generalized Potts models in an external mag- 
netic field on rf-dimensional hypercubic lattices. Without being overly complicated, 
these models are general enough to illustrate the essential features we are interested 
in. In addition, various subcases of these models are by themselves of considerable 
physical interest. Conceptual generalizations to other models are straightforward, 
but technical complications arise in some cases. 

For generalized Potts models the energy of the system is given by 

^C^) = 4''^ + M^""^ (2-3) 

where 

E'o'' = - E j^:i<i\<^') ^(^f ^) with si,,,,j) ={l'zi7 1 (2-4) 

<ii> ^ ' 

and 

N 
i=l 

In equation (2.4) the sum < ij > is over the nearest neighbour lattice sites and q^''^ 

(k) 

is the state of configuration k at site i. For the g-state Potts model takes on 
the values 1, . . . ,q. The Jij{qi, qj), {qi = 1, . . . ,q; qj = 1,. . . ,q) functions denote 
exchange coupling constants between the states at site i and site j. For 

Jij{qi,qj) = J > (conventionally J = 1) (2.6) 

the original Potts model is recovered and q = 2 becomes equivalent to the Ising 
ferromagnet. Potts glasses [27] and Ising spin glasses [35] are obtained when the 
exchange constants are quenched random variables. Other choices of the Jij include 
anti-ferromagnets and the fully frustrated Ising model [82] . 

The sum in equation (2.5) goes over all N sites i of the lattice. The external 
magnetic field is chosen to interact with the state qi = 1 at each site i, but not 
with the other states. Each configuration (microstate of the system) k defines a 
particular arrangements of all states at the sites and, vice versa, each arrangement 
of the states at the sites determines uniquely a configuration: 

k = {q['\...,qP}. (2.7) 

As there are q possible states at each site, the total number of microstates is 

Z{0) = K = q'' , (2.8) 



8 



Bernd A. Berg 



where we have used the definition (2.2) of Z. Already for moderately sized systems 
(f' is an enormously large number. If L^, (n = 1, . . . , d) are the edge lengths of the 
lattice, the number of sites is given by 

d 

AT = [] L„ , (2.9) 

n=\ 

i.e. N = L'^ for a symmetric lattice with Li = ■ ■ ■ = Lpf = L. 

Metropolis ct al. [66] introduced importance sampling for the Gibbs canonical 
ensemble^. Their original approach imitates the thermal fluctuations of nature 
and weights configurations with the Boltzmann factors (1.1). The implementation 
relies on a Markov process and generalizes immediately to arbitrary weights •w{k). 
Given a configuration k, new configurations k' are proposed with probabilities of 
an explicitly known transition matrix 

p°{k', k) = p\k, k'), Y.p\k', k) = l. (2.10) 
fe' 

The symmetry condition is known as detailed balance and can be replaced by the 
somewhat weaker condition of balance, see for instance [25] . The newly proposed 
configuration k' is accepted with the probability 

p"(fc',fc) = min[l,w(fc',/c)] with €i(k',k) = ^^^^ (2.11) 

w[kj 

and otherwise rejected. Here w(k) can be the Boltzmann factors (1.1), the multi- 
canonical weights (1.5) or any other function of k. Putting equations (2.10) and 
(2.11) together, the transition probability for k ^ k' becomes 

p{k', k) = p°(fc', k) p^ik', k) for k' k and p{k, fc) = 1 - ^ p{k', k) . (2.12) 

k'^k 

Note that rejected proposals k ^ k' have to be counted as /c — > fc transitions. It is 
a well-known beginner's mistake to undercount the at hand configuration k. 

It was first proven in [66] that the procedure (2.12) generates configurations 
k with the desired weights w{k) provided the property of ergodicity holds: Every 
configurations has to be reachable in a finite number of steps. 

For generalized Potts models implementation of the Metropolis algorithm is 
straightforward. The transition matrix p^(k',k) can be defined by the following 
procedure 

1. Pick one site i at random, i.e. with probability 1/A''. 

2. Assign one of the states 1, . . . , g to g-, each with probability 1/q. 

These rules account for 

0/^/ f 1/9 for k' = k, 

^ ' \ (9 — l)/(9-^) for each configuration k' with 7^ at one site i. 

(2.13) 

The new configuration k' is then accepted or rejected according to the rule (2.11). 
Ergodicity is fulfilled, because with (2.13) every configuration can be reached in a 
minimum number of N update steps. The random choice of a site insures detailed 
balance. A similar procedure where states are updated in the systematic order of 



^For an introduction to the canonical ensemble and statistical physics in general see for 
instance [52]. 
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a permutation tti, . . . jTTjv leads still to the desired distribution. Although detailed 
balance is then violated the weaker condition of balance still holds. 

The Metropolis process needs to run some while to establish equilibrium with 
respect to the ensemble defined by the weights with which it samples, see for in- 
stance [25] for an introduction and coupling from the past [74] is a recent rigorous 
approach. In the following we assume that equilibrium has been reached. Subse- 
quently the Metropolis process generates a number of equilibrium configurations fcj, 
i = 1, . . . , n on which observables O are measured similarly as in real experiments. 
Let us focus on the internal energy E. When Boltzmann weights Wsik) (1.1) are 
used the Metropolis process generates E^ = ijC^') values in the canonical ensemble 
and the estimator E oi O = E m (2.1) is simply the arithmetic average 



E 



1 " 

-E^'- (2-14) 



n . , 
1=1 



When other weights are used this is no longer true. After equilibrium is reached, 
a Metropolis process with the weights Wmu{k) (1-5) will generate configurations in 
the multicanonical ensemble. To obtain estimators of canonical expectation values 
(2.1) one has to re-weight to the canonical ensemble and equation (2.14) becomes 



E{P) =Y,E' «;-Ui^^) exp(-/3£;0 / ^ w;;^^{E^) exp(-/3i?0 • (2.15) 




It is instructive to compare the histograms Hi[E) = H0^{E), {i = 1,2) of two 
canonical simulations (/3i < /32) with the histogram HmuiE) generated by a multi- 
canonical Metropolis process. These histograms are estimators of the corresponding 
probability densities P{E) (1.2) and Pmui^) (1-7). A typical situation is depicted 
in figure 3. Histograms from canonical simulations at sufficiently distinct temper- 
atures will not overlap, whereas a histogram from a multicanonical simulation can 
bridge the entire range and allows to calculate canonical expectation values for the 
range /3i < /? < /32 (and, in the case depicted, also for some (3 > (32) ■ 

Obviously the re- weighting (2.15) can only work in a temperature range where 
the multicanonical histogram includes the canonical histogram from a simulation 
at that temperature. For insufficient statistics (or a false choice of multicanonical 
weights) the entries from outside the Hij{E) range may lead to spurious estimates 
of E{P) (2.15). This is an example of a bias problem. For dealing with such 
problems the jackknife method, see for instance [36, 10], is recommended. Already 
for the statistical analysis of results from canonical simulations this should really 
be the method of first choice and as such be taught in introductory books on MC 
simulations. For multicanonical simulations bias problems tend to become more 
subtle and the use of jackknife estimators becomes often a must. 

It is well-known [25] that a canonical MC simulation at temperature /3 = 1/T 
provides estimates for many but not all thermodynamical quantities of interest. Ob- 
tained are estimators for the internal energy E, quantities related to its derivatives, 
like the specific heat 

Cy = -^=P^^=P^ (< £;('=)i?('=) > - < E^"^ X E^"^ >) , (2.16) 

correlation functions, the magnetization and many others. However, canonical sim- 
ulations fails with respect to a number of important quantities, most prominently 
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Figure 3 Simulation of a 2d Ising model on a SO'^ lattice: Canonical his- 
tograms Hi{E) and H2(E) from simulations at /3i = and = 0.4 versus a 
multicanonical histogram Hjnu{E) which includes their ranges. 



the partition function Z{pi) itself and, related to it, for the Helmholtz free energy 

F(/3) = -rMnZ(/3) (2.17) 

and the entropy^ 



S = 



E-F 



(2.18) 



The reason is that the histograms H{E) of canonical sampUng yield the shape of 
the probability density P{E) but do not estimate the normalization constant C/3 
(1.2) which, due to ci)Y^^n{E) exp(— /3iJ) = 1, is 



cp = l/Z{/3) 



(2.19) 



Instead the normalization constant for the generated histogram is simply the inverse 
(arbitrary) number of generated configurations. In contrast to this multicanonical 
simulations allow to calculate Z{P) by exploiting that Z{0) = K = is known 
(2.8). After the multicanonical simulation the estimator for the canonical proba- 
bility density (1.8) is 



P{E) 



C0_ H^ujE) 



-I3E 



(2.20) 



where the superscript " on c!^^ indicates that this is the constant needed when 
using the histogram which relies on n configurations generated in the multicanonical 
ensemble. Assume that the /3-range for which rc-weighting is valid includes /? = 0. 
The constant cj^„ follows then from the known result cq = l/Z{0) = 1/K. Once 
,j is known (and hence the partition function) follows from the normalization 
J2e P{E) — 1' where /? has of course to stay in the admissible re-weighting range. 



^This is the canonical entropy which has to be distinguished from the microcanonical entropy 
used in the next section. 
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3 Multicanonical Recursion 

In this section a recursion [17] is presented which allows to obtain working esti- 
mates Wmu{k) of the weight factors (1.6). Essentially, it is a simplified and improved 
version of a recursion which was first published in [16] after being extensively tested 
in spin glass simulations [15]. 

It is recommended to start the recursion in the disordered phase of a system, 
where the system moves freely under MC updates. The zeroth simulation is done 
with the weights 

w°(fc) = 1 for aU fc. (3.1) 

The most obvious recursion towards the weights (1.6) goes as follows: Simulation 
n, (n = 0, 1, 2, ...) is carried out with the estimate w^{k) and yields the histogram 
H'^{E). Estimate n + 1 for the weight factors is then given by 



This simple-minded approach fails due to a number of difficulties 

1. What to do with histogram entries H"'{E) = or small? 

2. Each H"(E) calculation starts with zero statistics. Assume, someone has 
given us the exact weights (1.6) and we use w'^{k) = Wi/„(fc): The next 
estimate w^(A;) will be worse. A noisy left-over of wi/„(fc). 

3. The initial weights ufl{k) = 1 correspond to temperature infinity and are 
bad in the limit E^''^ Eg, where Eg is the groundstate of the systems 
which is approached in the zero temperature limit. 

The recursion derived in the following overcomes these difficulties. Let us first 
discuss the relationship [7, 16] of the weights (1.5) with the microcanonical tem- 
perature b{E) and the fugacity a{E), because it turns out to be advantageous to 
formulate the recursion in terms of these quantities. We have 

w{k) = e-^^^""^ = e-HE^'")^""+-(^"") (3.3) 

where S{E) is the microcanonical entropy and, by definition. 

This determines the fugacity function a{E) up to an (irrelevant) additive constant: 
We consider the case of a discrete minimal energy e and choose 

b{E) = [S{E + e) - S{E)] /e . (3.5) 

The identity S{E) = b{E) E - a{E) imphes 

S{E) - S{E - e) = b{E)E - b{E - e){E - e) - a{E) + a{E - e) . 

Inserting e b{E - e) = S{E) - S{E - e) yields 

a{E - e) = a{E) + [b{E - e) - b{E)] E (3.6) 

and a{E) is fixed by defining a(i^max) = 0. In summary, once b(E) is given, a{E) 
follows for free. The starting condition (3.1) becomes (other b°{E) choices are of 
course possible) 

6°(S) =a°(i;) =0. (3.7) 
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To avoid H{E) = we replace for the moment 

H{E) H{E) = max [ho, H{E)] , (3.8) 

where < ho < 1. Our final equations will allow for the limit ho 0. With this 
replacement we translate equation (3.2) into an equation for b{E). Subscripts o are 
used to indicate that those quantities are not yet our final estimators from the n*'' 
simulation. Let 

<+i(i;) = e-^o"-(^) = c^, 

where the (otherwise irrelevant) constant c is introduced to ensure that Sq~^^{E) 
can be an estimator of the microcanonical entropy. It follows 

SS+\E) = -In c + S''{E)+\nH''{E) . (3.9) 

Inserting this relation into (3.5) gives 

b'^^+HE) - b'^iE) + [lni?"(£; + e) - lnff"(£)]/e (3.10) 

The estimator of the variance of bQ'^^(E) is obtained from 

a^[b^+\E)] = a^[b"{E)] + a^[\nH'\E + e)]/e + <j^[\nH''{E)]/e . 

Now cr^[6"(£^)] = as b^{E) is the fixed function used in the n*'' simulation and 
the fluctuations are governed by the sampled histogram if" = H'^{E) 

0-2 [ln(^")] = <T2[ln(fl"")] = [ln(fl'" + Aff") - ln(fl"")]2 

where AH" is the fluctuation of the histogram, which is known to grow with the 
square root of the number of entries AH" ~ \fH^ . Hence, 

holds where c' is an unknown constant and this equation emphasizes that the vari- 
ance is infinite when there is zero statistics, i.e. H"{E) = or H"{E + e) = 0. 
The statistical weight for 6q+^(-E) is inversely proportional to its variance and the 
over-all constant is irrelevant. Choosing a convenient over-all constant we get 

^o(E)- HHE + e)+H"{E)- ^^'^^^ 

Note that gl^{E) = for H"'{E + e) = or H"'{E) = 0. The n*^ simulation was 
carried out using b"{E). It is now straightforward to combine bQ^^{E) and b"{E) 
according to their respective statistical weights into the desired estimator: 

b"+\E) = r{E) b"{E) + g^{E) b^+\E) , (3.13) 

where the normalized weights 

9o{E) = and ^"(E) = 1 - g^^iE) 

are determined by the recursion 

g"+\E)=g"{E)+g^{E), 5°(i;) =0. (3.14) 

We can eliminate 6q~''^(£^) from equation (3.13) by inserting its definition (3.10) 

and get 

b"+^{E) = b"{E) + g^{E) X [\nH"{E + e) - \nH"{E)]/e (3.15) 
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Figure 4 Recursion &" {E) for tlie 10-state Potts model on an 80^ lattice. 



Notice that it is now save to perform the limit 



in the definition of H (3.8). Namely, g^{E) = 
and 

5^ X lim [lniJ"(£;- 

/to— >0 



for H'^iE) = or + e) = 



■e) -lnJ?"(£;)] 



is well-defined for all i?"(£;). 

In contrast to the naive recursion (3.2) the recursion (3.15) includes the entire 
information assembled. Frequent iterations are allowed, implying increased stability 
and decreased CPU time consiimption. In addition it is recommended to implement 
a suitable b{E) guess for the not yet covered E i^min energy values. Figure 4 
depicts how the recursion works for the 2d 10-state Potts model (magnetic field H = 
0) on an 80^ lattice. The straight horizontal lines indicate the h{E), E ^ Eg, guess 
used after the corresponding recursion step and the noise at their off-spring points 
comes from the last recursions done before the snapshot was taken. Subsequent 
recursion steps were separated by one thousand MC sweeps. 

Finally, equation (3.15) can be converted into a direct recursion for ratios of 
the weight factor neighbors. We define 



w"{E) 
w"(£; + e) 



and get [18] 



R^'+^iE) = R'^iE) 



H'^iE + e) 
H-^{E) 



(3.16) 



(3.17) 



This version is of interest when RAM limitations are an issue. 
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4 Properties of the Algorithm and Miscellaneous Topics 

Before we come to the two major fields of applications, first order phase tran- 
sitions and complex systems, it is helpful to summarize a number of algorithmic 
issues. A reader who is not interest in more technical points may want to skip this 
section. 

4.1 Slowing down. Our typical situation is 

E — E ■ V 

^max -'-^min * 

The optimum for a flat energy distribution is given by a random walk in the en- 
ergy [7]. This implies a CPU time increase 

to keep the number of i^max £"111111 ^'max transitions constant. The recursion 
(3.15) needs an additional ~ V^-^ (optimum) attempts to cover the entire range. 
It follows: 

slowing down ~ V^'^ or worse. 

For first order phase transitions (see the next section) a recursion alternative is 
patching of overlapping constraint MC simulations: 

number of (fixed size) patches ~ V . 

When results can be obtained by keeping the number of updates per spin (sweeps) 
in each patch constant, another CPU factor ~ V follows. In this case we can get: 

optimal performance . 

However, in practice this approach may face ergodicity problems. 

4.2 Static and dynamic aspects of the algorithm. The actual perfor- 
mance of the algorithm tends to be close to the optimal performance, when one 
focuses on static applications. Here static means that some canonically rare con- 
figurations have to be enhanced. In dynamical applications one uses the method 
to by-pass rare configurations without actually enhancing then. Whether the slow- 
ing down is still close to the optimum or not depends then on whether a suitable 
parameterization of the problem can be found and for complex systems this is an 
open issue. 

Static Examples: 

• Magnetic field driven first-order phase transitions: Configurations with zero 
(or small) magnetic fields arc exponentially suppressed at low temperatures 
and they exhibit domain walls. 

• Temperature driven first-order phase transitions: Configurations with do- 
main walls are exponentially suppressed. 

• These rare configurations can be enhanced [8, 12]. 
Dynamic Examples: 

• The following transitions my be induced by multicanonical simulations which 

include the high temperature region [11]: 

• Low temperature transitions between magnetic states (for instance the up- 
down states of the Ising model below the Curie temperature, ...). 

• Transition between low temperature states in systems with confiicting con- 
straints: Spin glasses, proteins, the traveling salesman problem, ... 
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4.3 Variants of the multicanonical methods. Multicanonical lefeis to cal- 
culations of canonical expectation values for a temperature range and re-weighting 
has to be done in the internal energy. Similarly, other physical quantities can be 
considered, e.g. multimagnetical [12] refers to simulations which give results for 
a certain range of the magnetic field. A variant for cluster updates due to Janke 
and Kappler [58] is called multibondic. Recently the multi-overlap algorithm was 
introduced [19] which focuses on the Parisi order parameter for spin glasses. All 
these algorithms fall into the general class of umbrella sampling methods, which 
knows similar distictions, like temperature-scaling MC [78] and the later intro- 
duced density-scaling MC [80]. 

4.4 Technical progress. A number of papers have reported further progress 

along the lines of multicanonical algorithms. For asymmetric distributions Borgs 
and Kappler [30] suggest to use an equal weight instead of the equal height criterium 
employed in equation (1.4). Hesselbo and Stinchcombe [50] made an attempt to 
optimize the weights and propose 



instead of (1-6). Their arguments cover the static situation (pertinence), but as one 
has no a-priori control over the dynamics of the systems, the question of optimal 
weights remains one of trial and error. Combining multicanonical with multigrid 
methods has been explored by Janke and Saucr [57]. For molecular dynamics, 
Langevin and hybrid MC variants see Hansmann et al. [44] and, with emphasize 
on lattice gauge theory, Arnold ct al. [2]. Connections with adaption and linear 
response theory were explored in [71] and optimization of estimators is discussed 
in [38]. Occasionally attempts have been made to use bivariate multicanonical 
weighting [51], with most spectacular results claimed recently [48]. Concerning 
the latter reference, it appears to be too early for a final judgement. In general 
numerical methods tend to become unstable when the parameter space becomes 
too big. 

4.5 Parallel tempering. The developments sketched above have to be distin- 
guished from the the parallel tempering approach which is for some applications the 
major competing method and should by no means be confused with multicanonical 
sampling. 

The first paper on parallel tempering is due to Geyer [40] who introduces it 
as the method of multiple Markov chains. Independently it was developed in other 
papers. The method of expanded ensembles was introduced by Lyubartscv ct al. [65] 
and proposes to enlarge the configuration space by introducing new dynamical 
variables. The better known simulated tempering [67] method of Marinari and 
Parisi can be considered as the special case where the temperature becomes the 
new dynamical variable. Building on simulated tempering the method of parallel 
tempering was introduced and tested by Hukusima and Nemoto [53]. It is identical 
with Gcyer's [40] proposal and well suited for clusters of workstations and massively 
parallel computer architectures (although a moderately fast network between the 
nodes is sufficient). 

Parallel tempering performs n canonical MC simulations at different /3-values 
with Boltzmann weight factors 




(4.1) 



E'<E 




= 1 



,...,n, (3i < (32 < ... < /3n-i 



< A 



•n 
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and allows exchange of neighbouring /3-values 

A-i^ — >bifori = 2,...,n. (4.2) 
These transitions lead to the energy change 

AE = ) - - ) - A-i^l!?) 

= (ft-A_i) (i^P (4.3) 

which is accepted or rejected according to the Metropolis algorithm. The /3j-values 
have to be determined such that a reasonably large acceptance rate is obtained 
for the P exchange (4.2) and ref.[53] employs a recursive method due to Kerler 
and Rehberg [61]. This is similar to the recursion needed at the beginning of a 
multicanonical simulation. 

Remark: The method works for dynamical but not for statical supercritical 
slowing down. Each member of the discrete set of weight factors samples still a 
Boltzmann distribution {e.g. it is not a valid method for calculating interfacial 
tensions). 

4.6 Random Walk Algorithms. In ref. [14] a class of algorithms was de- 
signed to perform a random walk in the energy (or any other function of the mi- 
crostates). Assume the Metropolis proposal probabilities p°(fc', k) (2.10) are defined 

which allow for A^f'^) moves out of configuration k (for the generalized Potts model 
probabilities (2.13) we have A''^'^) = (q — 1)^ independently of k). We may divide 
the N^''^ moves into three classes: 

]\f{k) = _/v+ + Ar° + A/-- (4.4) 

with 

1. N+ moves with AE, > 0, z = 1. . . . , N+. 

2. N° moves with AEi = 0, i = N+ + 1, . . . , N+ + N° . 

3. N- moves with AEi < 0, i = N+ + + 1, . . . , A^C^). 

The observation of [14] is that it is easy to define transition probabilities pf, 
p°i andpr, 

N+ N+ + N° JV"") 

such that 

N+ JV"") 

4=1 i=Ar+ + Aro_|-i 

holds (but at the extrema). For any such choice the algorithm performs (away from 
the extrema) a random walk in the energy and, hence, samples a broad energy 
distribution. 

In [14] configuration dependent transition probabilities 

p+(fc), p°(fc) and p+{k) (4.6) 

were used, where the dependence on the configurations k exceeds a mere energy 
dependence. The algorithm sacrifices then an exact relationship with the canonical 

ensemble in favor of having a-priori well-defined defined transition probabilities and 
was conjectured to be of interest for optimization problems, where one focuses on 
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minima and not in the canonical ensemble. With this in mind the name random 
cost algorithm was chosen in [14] . 

Wang's [83] recent random walk algorithm is a special case of equations (4.5), 
but uses microcanonical averages 

pt{E), p°{E) and p+{E) . (4.7) 

The configuration dependence (4.6) is then reduced to energy dependence and 
canonical expectation values can be recovered by using the equation 

n{E) N{E, AE) = n{E + AE) N{E + AE, ~AE) (4.8) 

where N(E, AE) is the microcanonical average for the number of transitions from 
configurations with energy E to configurations with energy E' = E + AE. For 
proofs of (4.8) see the appendix of [20] or ref.[72]. 

It seems [83] that in the random walk MC the use of estimators for N{E, AE) 
instead of their unknown exact values faces more serious problems than in multi- 
canonical simulations. Therefore, the best application of equation (4.8) might be to 
employ it as input in the iteration towards the multicanonical weights (1.5). Such 
a use of transition probabilities was first proposed by Smith and Bruce [76, 77] 
and apparently deserves renewed attention. 

5 First Order Phase Transitions 

Multicanonical simulations arc best established for investigations of first-order 
phase transitions. The range of applications goes from studies of mathematically 
ambitious topics to chemistry oriented ones. For instance an investigation [23] 
of the mathematically rigorous Borgs-Kotecky [28] FSS theory reveals that very 
strong phase transitions or very large lattices are needed to observe the asymptotic 
behavior predicted by their theory. To give two examples from the chemistry side, 
an investigation of the coexistence curve of the Lcnnard- Jones fluid was performed 
in ref.[84, 54] and liquid- vapor transitions in fluids were studied in ref.[85, 86]. 
The Lennard-Jones fluid has also been a major field of applications for the original 
umbrella sampling method and variants thereof [78, 34, 81]. 

A lot of work has focused on calculations of interfacial tensions and an overview 
is given in the forthcoming. 

5.1 2d Potts Models. The pioneering study was performed for the 2d 10- 

statc Potts model [8] and 2/" = 0.0978(8) was found through FSS study of the 
equation (1.4). Only afterwards the exact value was discovered to be 2/* = 
0.094701... [29]. Once, the exact result was known, the remaining, small discrep- 
ancy could be eliminated by improving the finite volume estimators [24] . For these 
simulations the slowing down is around ~ V^-^, i.e. reasonably close to the optimal 
performance. For a related investigation of the 7-state 2d Potts model see [56]. 

5.2 2d and 3d Ising Model. Many real physical systems fall into the uni- 
versality class of the 3d Ising model. Despite its simplicity, it is therefore a very 
rewarding object to study. Although many of its universal parameters have already 
been determined with high precision, others are still in the making. In particular, 
there has been interest in the universal surface tension and the critical-isotherm 
amplitude ratios [87]. To obtain them, one needs accurate interfacial tension re- 
sults below the Curie temperature and multimagnetical simulations [12, 13] have 
become the enabling technique for Binder's [26] histogram method, which was orig- 
inally proposed in this context. 
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For the 2d Ising model Onsager's exact result was reproduced with good accu- 
racy [12]. However, for the 3d model the temperature dependence of the interfacial 
tension [13] has, unfortunately, come out erratic. Therefore, the results of ref.[47] 
appear to be the up-to-date best estimates. Considerable technical improvements 
of multimagnetical calculations are nowadays feasible and it seems worthwhile to 
start off a new multimagnetical 3d Ising model simulation. 

5.3 SU{3) Gauge Theory. One is interested in the interfacial tension for the 
confinement/deconfinement phase transition. The use of multicanonical techniques 
has been explored by Grossmann, Laursen et al. [41, 42]. In particular, they 
noticed that it is suitable to use an asymmetric lattice, V = L^L^Lt with > 3L. 
This forces the interfaces into the plane and ensures a flat region for the minimum 
of equation (1.3), thus greatly facilitating the extraction of finite-lattice values for 
the interfacial tension and, consequently, the FSS analysis. 

For SU{3) gauge theory the interfacial tension is usually denoted by the symbol 
a and estimates are conveniently given as multiples of = (Tc)^, where Tc is the 
deconfinement temperature. Using the conventions of [55], the estimates of [42] 
are 

<T = 0.052(4) Tj*, {Lt = 2) and <t = 0.020(2) T^, {Li = 4) 

This may be compared with the later estimate by Iwasaki ct al. [55] 

a = 0.02925(22) T^, {Lt = 4) and a = 0.0218(33) T^, [Lt = 6) 

The discrepancy (only Lt — 4, can be compared) is presumably due to too small 
lattice sizes in [42]. Physically, one is interested in the Lj ^ oo limit. Possibly 
the strong Lt dependence can be eliminated by using tadpole improved actions for 
which Beinlich et al. [6] report 

a = 0.0155(16) T^, [Lt = 3 and = 4) . 

5.4 Electroweak Phase Transition. Baryon violating processes are unsup- 
pressed for T > Tc, where Tc is the electroweak critical temperature. It has been 
conjectured, that this may allow to explain the baryon asymmetry in nature. Mod- 
els tie the nucleation rate to the interface tension of the transition. Using an 
effective scalar field theory [60] or the full theory [33], multicanonical and related 
techniques turn out to be useful for simulations at a Higgs mass 

rriH « (35-37) GeV , 

where one deals with a relatively strong first-order transitions, as needed to explain 
the baryon asymmetry. Unfortunately, it turn out that the transition weakens for 
higher Higgs masses, see ref.[69] for a concise review. 

5.5 Weight factor estimates. For spin systems with first-order phase transi- 
tions the FSS behavior is relatively well-known. Provided the steps between system 
sizes are not too large, it is then possible to get working estimates of the Wmu{E) 
weights by incians of a FSS extrapolation from the already simulated smaller systems 
[8, 13]. Another method which works for these systems is patching of overlapping, 
constraint [22] MC simulations. This has been employed by various groups, but 
documentation has remained sketchy, presumably it is best in [33]. It should be 
emphasized that the purpose of the constrained MC simulations is here to get a 
working estimate of the multicanonical weights, whereas in [22] they were used 
for final estimates of physical observables. The advantage of the multicanonical 
simulation is that it has not the ergodicity problems from which microcanonical 
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and constraint MC simulations tend to suffer. While these approaches work for 
first order phase transitions, they fail as soon as ergodicity problems become severe 
what is normally the case for complex systems. Then it is necessary to employ 
a recursion like the one of section 3, which is a convenient choice for first order 
transitions too. 

6 Complex Systems 

Soon after the introduction of multicanonical methods, their potential relevance 
for investigations of spin glasses and other complex systems was pointed out [11]. In 
these systems one encounters large free energy barriers due to disorder and frustra- 
tions. Multicanonical simulations try to overcome the barriers through excursions 
into the disordered phase. Examples are spin glasses, proteins, hard optimization 
problems and others. 

For an efficient application of the multicanonical idea one needs some some 
physical insight: The configurations of interest have to be identified and a suitable 
parameterization has to be found which allows their actual enhancement. Whereas 
for first order phase transition the appropriate parameters (temperature, magne- 
tization, etc.) are quite obvious, it is a major open problem whether suitable 
parameters exist for classes of complex systems. 

6.1 Spin glasses. Multicanonical studies have so far been limited to the 
simplest, realistic prototypes, the 2d and 3d Edwards- Anderson ±J Ising^ spin 
glass [11, 15]. Significant progress has been achieved with respect to groundstate 
energy and entropy calculations. In 3d the groundstate energy and entropy per spin 
(cg = Eg/N and Sg = Sg/N) estimates of [15] are 

eg = -1.8389 ± 0.0040 and Sg = -1.7956 ± 0.0042 . 

However, on the algorithmic side the slowing down with volumes size is very bad, 
around or, possibly, exponential. Certain advantages of simulated tempering 
are claimed in [61] and [53], but these studies fail to present comparisons under 
identical conditions and presently there is no firm evidence that simulated tem- 
pering yields a significantly better slowing down. These studies weight spin glass 
configurations with the inverse spectral density (1.6). Recently a more focused ap- 
plication of the multicanonical method to spin glass simulations was developed [19]. 
The idea is to explore the free-energy structure in the Parisi order parameter such 
that the barriers in this variable can be mapped out. These investigations are in 
progress. Marinari, Parisi and collaborators perform major spin glass simulations 
using parallel tempering, for a review see [68] . 

6.2 Proteins. Proteins are linear polymers with the 20 naturally occurring 
amino acids as monomers. Chains smaller than a few tens of amino acids are 
called peptides. The problem is to predict the folded conformation of proteins and 
peptides solely from their amino acid sequence. For many years the emphasis of 
numerical investigations has been on finding the global minimum potential energy 
and the major difficulty encountered is the multiple minima problem. Molecular 
dynamics has been the numerical method of first choice, but the fraction of sto- 
chastic investigations shows an increasing trend. In particular simulated annealing 
has been frequently used. 



This is J = or 1 in the 2-state Potts model notation. 
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Figure 5 Example configurations from a multicanonical simulation of poly- 
alanine [45] (courtesy Ulricii Hansmann). 

The major advantage of multicanonical and related methods in the context of 
proteins is that they allow for investigations of the thermodynamics of the entire free 
energy landscape of the protein. This was realized by Hansmann and Okamoto [43] 
when they introduced multicanonical sampling to the problem of protein folding 
and, slightly later, by Hao and Scheraga [49]. Since then numerous applications 
were performed and the simulations have been quite successful for peptides, but 
face tremendous problems concerning scaling to large systems. A particularly nice 
example is depicted in figure 5: The folding of poly-alanine into its a-helix coil [45]. 
No a-priori information about the groundstate conformation is used in these kind 
of simulations. By now a quite extensive literature exists which is compiled in [46]. 

6.3 Optimization problems. They occur in engineering, network and chip 
design, traffic control, and many other situations. General purpose algorithms for 
their solution are simulated annealing and genetic algorithms. To those, we may 
now like to add multicanonical annealing [64] and random cost [14]. 

Multicanonical annealing is a combination of multicanonical sampling with vari- 
able upper bounds and frequent adaption of parameters. A promising study [64] 
has been performed for the traveling salesman problem. Up to A*" = 10, 000 cities, 
randomly distributed in the unit square, were considered and scaling of the path 
length as function of A^ was investigated. The reported algorithmic performance is 
so good, that an independent confirmation would be desireable. 

Recently the random cost algorithm (4.5) has been applied to topology op- 
timization in the engineering of trusses [5]. The performance was found to be 
competitive to that of evolutionary (genetic) algorithms. 
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7 Outlook and Conclusions 

Sampling of broad energy distributions allows to overcome supercritical slow- 
ing down. This is well established for first-order phase transitions. Systems with 
conflicting constraints remain, despite some progress, notoriously difiicult and for 
them most hope lies on achieving further algorithmic improvements. In addition, 
multicanonical methods may also be of interest when dealing with second order 
phase transitions, but so far nobody really cared to investigate this direction. 

Multicanonical simulations have the potential to be replace canonical simu- 
lations as the method of first choice for exploratory MC studies. Their major 
advantage is that they give the thermodynamics for the entire temperature range 
of interest in one single MC run, whereas several to many (for each /3-value one) 
canonical MC runs are needed. So far the major stumbling block against using 
multicanonical simulations in this way has been the lack of easily available imple- 
mentations of recursion schemes and data analysis programs. The author thinks 
that these shortcomings can be overcome. 
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